
    Global Convergence of the (1+1) Evolution Strategy

    We establish global convergence of the (1+1) evolution strategy, i.e., convergence to a critical point independent of the initial state. More precisely, we show the existence of a critical limit point, using a suitable extension of the notion of a critical point to measurable functions. At its core, the analysis is based on a novel progress guarantee for elitist, rank-based evolutionary algorithms. Applying it to the (1+1) evolution strategy allows us to characterize accurately whether global convergence is guaranteed with full probability, or whether premature convergence is possible. We illustrate our results on a number of example applications, ranging from smooth (non-convex) cases, through different types of saddle points and ridge functions, to discontinuous and extremely rugged problems.
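
    For intuition, here is a minimal sketch of a (1+1) evolution strategy. It uses the classical 1/5th success rule for step size control, which is an illustrative choice rather than necessarily the exact variant analyzed in the paper; the objective and all constants are likewise assumptions.

        import numpy as np

        def one_plus_one_es(f, x, sigma=1.0, iterations=1000, rng=None):
            # Minimal elitist (1+1)-ES with 1/5th success rule step size control.
            rng = rng or np.random.default_rng(0)
            fx = f(x)
            for _ in range(iterations):
                y = x + sigma * rng.standard_normal(x.size)  # Gaussian mutation
                fy = f(y)
                if fy <= fx:                 # elitist, rank-based acceptance
                    x, fx = y, fy
                    sigma *= 1.5             # success: enlarge the step size
                else:
                    sigma *= 1.5 ** -0.25    # failure: shrink (targets ~1/5 success rate)
            return x, fx

        # example: a smooth objective
        x_best, f_best = one_plus_one_es(lambda v: float(np.sum(v ** 2)), np.ones(10))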

    The Planning-ahead SMO Algorithm

    The sequential minimal optimization (SMO) algorithm and variants thereof are the de facto standard method for solving the large quadratic programs arising in support vector machine (SVM) training. In this paper we propose a simple yet powerful modification. The main emphasis is on an algorithm that improves the SMO step size by planning ahead. A theoretical analysis ensures its convergence to the optimum. Experiments on a large number of datasets demonstrate the superiority of the new algorithm.
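
    As background, the sketch below shows the classical SMO update of a single pair of dual variables (Platt's clipped Newton step), which is the baseline that a planning-ahead step size rule would modify; the variable names and the error-vector interface are illustrative assumptions.

        import numpy as np

        def smo_pair_update(alpha, y, K, E, i, j, C):
            # Classical clipped Newton step on the pair (alpha_i, alpha_j) of the
            # kernel SVM dual; K is the kernel matrix, E[k] the prediction error
            # on example k. Returns True if the pair was updated.
            if i == j:
                return False
            if y[i] != y[j]:   # box [L, H] from 0 <= alpha <= C and y'alpha = const
                L, H = max(0.0, alpha[j] - alpha[i]), min(C, C + alpha[j] - alpha[i])
            else:
                L, H = max(0.0, alpha[i] + alpha[j] - C), min(C, alpha[i] + alpha[j])
            eta = K[i, i] + K[j, j] - 2.0 * K[i, j]   # curvature along the direction
            if eta <= 0.0 or L >= H:
                return False
            a_j = float(np.clip(alpha[j] + y[j] * (E[i] - E[j]) / eta, L, H))
            alpha[i] += y[i] * y[j] * (alpha[j] - a_j)  # preserve equality constraint
            alpha[j] = a_j
            return True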

    Challenges of Convex Quadratic Bi-objective Benchmark Problems

    Convex quadratic objective functions are an important base case in state-of-the-art benchmark collections for single-objective optimization on continuous domains. Although often considered rather simple, they represent the highly relevant challenges of non-separability and ill-conditioning. In the multi-objective case, quadratic benchmark problems are under-represented. In this paper we analyze the specific challenges that quadratic functions can pose in the bi-objective case. Our construction yields a full factorial design of 54 different problem classes. We perform experiments with well-established algorithms to demonstrate the insights that can be supported by this function class. We find huge performance differences, which can be clearly attributed to two root causes: non-separability and alignment of the Pareto set with the coordinate system.
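
    The following sketch shows, under stated assumptions, how a bi-objective convex quadratic test problem with controllable conditioning and rotation (and hence Pareto set alignment with the coordinate system) could be constructed; it is an illustrative construction, not the paper's 54-class factorial design.

        import numpy as np

        def make_biobjective_quadratic(n, cond=1e3, rotate=True, seed=0):
            # Two convex quadratics f_k(x) = 0.5 * (x - c_k)' H_k (x - c_k); 'cond'
            # sets the condition number, 'rotate' destroys separability (n >= 2).
            rng = np.random.default_rng(seed)
            def hessian():
                eigvals = cond ** (np.arange(n) / (n - 1))   # eigenvalues 1 .. cond
                if rotate:
                    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
                    return Q @ np.diag(eigvals) @ Q.T        # rotated, non-separable
                return np.diag(eigvals)                      # axis-aligned, separable
            H1, H2, c1, c2 = hessian(), hessian(), np.zeros(n), np.ones(n)
            f1 = lambda x: 0.5 * (x - c1) @ H1 @ (x - c1)
            f2 = lambda x: 0.5 * (x - c2) @ H2 @ (x - c2)
            return f1, f2

        f1, f2 = make_biobjective_quadratic(n=10)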

    Limits of End-to-End Learning

    End-to-end learning refers to training a possibly complex learning system by applying gradient-based learning to the system as a whole. An end-to-end learning system is specifically designed so that all modules are differentiable. In effect, not only a central learning machine but also all "peripheral" modules like representation learning and memory formation are covered by a holistic learning process. The power of end-to-end learning has been demonstrated on many tasks, like playing a whole array of Atari video games with a single architecture. While pushing for solutions to more challenging tasks, network architectures keep growing more and more complex. In this paper we ask whether, and to what extent, end-to-end learning is a future-proof technique in the sense of scaling to complex and diverse data processing architectures. We point out potential inefficiencies, and we argue in particular that end-to-end learning does not make optimal use of the modular design of present neural networks. Our surprisingly simple experiments demonstrate these inefficiencies, up to the complete breakdown of learning.

    Accelerated Linear SVM Training with Adaptive Variable Selection Frequencies

    Support vector machine (SVM) training has been an active research area since the dawn of the method. In recent years there has been increasing interest in specialized solvers for the important case of linear models. The algorithm presented by Hsieh et al., probably best known as the "liblinear" implementation, marks a major breakthrough. The method is analogous to established dual decomposition algorithms for training non-linear SVMs, but with greatly reduced computational complexity per update step. This comes at the cost of no longer keeping track of the gradient of the objective, which rules out the application of highly developed working set selection algorithms. We present an algorithmic improvement to this method. We replace uniform working set selection with an online adaptation of selection frequencies. The adaptation criterion is inspired by modern second order working set selection methods. The same mechanism replaces the shrinking heuristic. This novel technique speeds up training in some cases by more than an order of magnitude.
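
    For reference, here is a minimal sketch of dual coordinate descent for the L1-loss linear SVM in the style of Hsieh et al., extended with a hook for non-uniform coordinate selection; the selection frequencies are simply passed in here, whereas the paper adapts them online. All names and constants are illustrative.

        import numpy as np

        def dual_cd_linear_svm(X, y, C=1.0, epochs=10, freq=None, seed=0):
            # Dual coordinate descent for the L1-loss linear SVM; 'freq' are
            # per-coordinate selection frequencies (uniform by default).
            # Assumes no all-zero rows in X.
            rng = np.random.default_rng(seed)
            n, d = X.shape
            alpha, w = np.zeros(n), np.zeros(d)
            Qii = np.einsum('ij,ij->i', X, X)             # diagonal of the dual Hessian
            p = np.full(n, 1.0 / n) if freq is None else freq / freq.sum()
            for _ in range(epochs):
                for i in rng.choice(n, size=n, p=p):      # non-uniform coordinate choice
                    G = y[i] * (w @ X[i]) - 1.0           # partial derivative at i
                    a_new = float(np.clip(alpha[i] - G / Qii[i], 0.0, C))
                    w += (a_new - alpha[i]) * y[i] * X[i] # maintain w = sum a_i y_i x_i
                    alpha[i] = a_new
            return w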

    Coordinate Descent with Online Adaptation of Coordinate Frequencies

    Coordinate descent (CD) algorithms have become the method of choice for solving a number of optimization problems in machine learning. They are particularly popular for training linear models, including linear support vector machine classification, LASSO regression, and logistic regression. We consider general CD with non-uniform selection of coordinates. Instead of fixing selection frequencies beforehand, we propose an online adaptation mechanism for this important parameter, called the adaptive coordinate frequencies (ACF) method. This mechanism removes the need to estimate optimal coordinate frequencies beforehand, and it automatically reacts to changing requirements during an optimization run. We demonstrate the usefulness of our ACF-CD approach for a variety of optimization problems arising in machine learning contexts. Our algorithm offers significant speed-ups over state-of-the-art training methods.
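
    The sketch below conveys the flavor of such an adaptation mechanism: each coordinate carries a preference weight that grows when updating it yields above-average progress and shrinks otherwise. The update rule and all constants here are illustrative guesses, not the published ACF rule.

        import numpy as np

        class AdaptiveCoordinateFrequencies:
            # After a step on coordinate i, its preference weight grows if the
            # observed progress (a nonnegative objective decrease) beats a
            # running average, and shrinks otherwise.
            def __init__(self, n, c=0.2, pmin=1.0 / 32, pmax=32.0):
                self.pref, self.avg = np.ones(n), 1.0
                self.c, self.pmin, self.pmax = c, pmin, pmax

            def probabilities(self):
                return self.pref / self.pref.sum()    # selection distribution

            def update(self, i, progress):
                self.avg = 0.95 * self.avg + 0.05 * progress
                ratio = progress / max(self.avg, 1e-12)
                self.pref[i] = np.clip(self.pref[i] * ratio ** self.c,
                                       self.pmin, self.pmax)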

    Anytime Bi-Objective Optimization with a Hybrid Multi-Objective CMA-ES (HMO-CMA-ES)

    We propose a multi-objective optimization algorithm aimed at achieving good anytime performance over a wide range of problems. Performance is assessed in terms of the hypervolume metric. The algorithm, called HMO-CMA-ES, is a hybrid of several old and new variants of CMA-ES, complemented by BOBYQA as a warm start. We benchmark HMO-CMA-ES on the recently introduced bi-objective problem suite of the COCO framework (COmparing Continuous Optimizers), consisting of 55 scalable continuous optimization problems, which is used by the Black-Box Optimization Benchmarking (BBOB) Workshop 2016.
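
    Since performance is assessed by the hypervolume metric, the following sketch computes the 2-D hypervolume (for minimization) of a point set by a simple sweep; the reference point and example values are illustrative.

        def hypervolume_2d(points, ref):
            # Hypervolume of a 2-D point set for minimization w.r.t. a reference
            # point: sort points by the first objective and sum the rectangles.
            pts = sorted((p for p in points if p[0] < ref[0] and p[1] < ref[1]),
                         key=lambda p: p[0])
            hv, y_prev = 0.0, ref[1]
            for x, y in pts:
                if y < y_prev:                     # skip dominated points
                    hv += (ref[0] - x) * (y_prev - y)
                    y_prev = y
            return hv

        print(hypervolume_2d([(1, 3), (2, 2), (3, 1)], ref=(4, 4)))  # 6.0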

    Vehicle Shape and Color Classification Using Convolutional Neural Network

    This paper presents a module for vehicle re-identification based on make/model and color classification. It could be used in Automated Vehicular Surveillance (AVS) or for the fast analysis of video data. Many problems related to this topic had to be addressed. In order to facilitate and accelerate progress on this subject, we present our approach to collecting and labeling a large-scale data set. We trained deep neural networks, which showed good classification accuracy. We show the results of make/model and color classification on controlled and video data sets. With the help of a developed application, we demonstrate the re-identification of vehicles in video images based on make/model and color classification. This work was partially funded under the grant

    Dual SVM Training on a Budget

    We present a dual subspace ascent algorithm for support vector machine training that respects a budget constraint limiting the number of support vectors. Budget methods are effective for reducing the training time of kernel SVMs while retaining high accuracy. To date, budget training has been available only for primal (SGD-based) solvers. Dual subspace ascent methods like sequential minimal optimization are attractive for their good adaptation to the problem structure, their fast convergence rate, and their practical speed. By incorporating a budget constraint into a dual algorithm, our method enjoys the best of both worlds. We demonstrate considerable speed-ups over primal budget training methods.
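
    As a point of reference, the sketch below shows one deliberately naive budget maintenance step, dropping the support vector with the smallest coefficient magnitude whenever the budget is exceeded; this kind of heuristic is common in primal budget methods and serves only as a stand-in for the paper's dual subspace ascent approach.

        import numpy as np

        def enforce_budget(sv_x, sv_alpha, B):
            # While more than B support vectors exist, drop the one with the
            # smallest coefficient magnitude (smallest contribution heuristic).
            while sv_alpha.size > B:
                k = int(np.argmin(np.abs(sv_alpha)))
                sv_x = np.delete(sv_x, k, axis=0)
                sv_alpha = np.delete(sv_alpha, k)
            return sv_x, sv_alpha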

    Limited-Memory Matrix Adaptation for Large Scale Black-box Optimization

    The Covariance Matrix Adaptation Evolution Strategy (CMA-ES) is a popular method for nonconvex and/or stochastic optimization problems when gradient information is not available. Building on the CMA-ES, the recently proposed Matrix Adaptation Evolution Strategy (MA-ES) provides the rather surprising result that the covariance matrix and all associated operations (e.g., the potentially unstable eigendecomposition) can be replaced in the CMA-ES by an updated transformation matrix without any loss of performance. In order to further simplify MA-ES and reduce its O(n^2) time and storage complexity to O(n log n), we present the Limited-Memory Matrix Adaptation Evolution Strategy (LM-MA-ES) for efficient zeroth order large-scale optimization. The algorithm demonstrates state-of-the-art performance on a set of established large-scale benchmarks. We explore the algorithm on the problem of generating adversarial inputs for a (non-smooth) random forest classifier, demonstrating a surprising vulnerability of the classifier.
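
    To illustrate the limited-memory idea, the sketch below samples a candidate by applying m stored direction vectors to a standard normal draw instead of multiplying by an n x n matrix, giving O(m*n) cost per sample; the interface and the learning rates c_d are illustrative assumptions, not the published constants.

        import numpy as np

        def lmmaes_sample(mean, sigma, M, c_d, rng):
            # M holds m direction vectors of dimension n (m << n); instead of an
            # n x n transformation matrix, apply m rank-one maps to a normal draw.
            d = rng.standard_normal(mean.size)
            for i in range(M.shape[0]):
                d = (1.0 - c_d[i]) * d + c_d[i] * M[i] * (M[i] @ d)
            return mean + sigma * d   # candidate solution, O(m*n) per sample

        rng = np.random.default_rng(0)
        n, m = 1000, 10
        x = lmmaes_sample(np.zeros(n), 0.5,
                          rng.standard_normal((m, n)) / np.sqrt(n),
                          np.full(m, 0.1), rng)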